Chi-square Tests Driven Method for Learning the Structure of Factored MDPs

Authors

  • Thomas Degris
  • Olivier Sigaud
  • Pierre-Henri Wuillemin
Abstract

SDYNA is a general framework designed to address large stochastic reinforcement learning (RL) problems. Unlike previous model-based methods for factored MDPs (FMDPs), it incrementally learns the structure of an RL problem using supervised learning techniques. SPITI is an instantiation of SDYNA that uses decision trees as factored representations. First, we show that, in structured RL problems, SPITI learns the structure of FMDPs using Chi-Square tests and performs better than classical tabular model-based methods. Second, we study the generalization property of SPITI using a Chi-Square based measure of the accuracy of the model built by SPITI.
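The abstract says that Chi-Square tests drive structure learning in decision trees. As a rough illustration only (this is not the authors' implementation, and the function names `chi_square_statistic` and `should_split` are hypothetical), the sketch below computes a Pearson Chi-Square statistic over observed (attribute value, outcome) pairs and splits on the attribute only if the statistic exceeds a caller-supplied critical value. The threshold 3.841 used in the usage note is the standard 5% critical value for one degree of freedom.

```python
from collections import Counter


def chi_square_statistic(pairs):
    """Pearson chi-square statistic for a list of (attribute_value, outcome) pairs.

    Returns (statistic, degrees_of_freedom). Hypothetical helper for illustration.
    """
    n = len(pairs)
    row_totals = Counter(a for a, _ in pairs)   # counts per attribute value
    col_totals = Counter(o for _, o in pairs)   # counts per outcome
    observed = Counter(pairs)                   # counts per (value, outcome) cell
    stat = 0.0
    for a in row_totals:
        for o in col_totals:
            # Expected count if attribute and outcome were independent.
            expected = row_totals[a] * col_totals[o] / n
            stat += (observed[(a, o)] - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


def should_split(pairs, critical_value):
    """Split only if the attribute and the outcome look statistically dependent."""
    stat, dof = chi_square_statistic(pairs)
    return dof > 0 and stat > critical_value
```

For example, with 20 samples of `(0, 0)` and 20 of `(1, 1)` the attribute fully determines the outcome, so `should_split(pairs, 3.841)` accepts the split, whereas a uniform spread over all four cells gives a statistic of zero and the split is rejected.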


Similar resources

Incremental Structure Learning in Factored MDPs with Continuous States and Actions

Learning factored transition models of structured environments has been shown to provide significant leverage when computing optimal policies for tasks within those environments. Previous work has focused on learning the structure of factored Markov Decision Processes (MDPs) with finite sets of states and actions. In this work we present an algorithm for online incremental learning of transitio...


Efficient Structure Learning in Factored-State MDPs

We consider the problem of reinforcement learning in factored-state MDPs in the setting in which learning is conducted in one long trial with no resets allowed. We show how to extend existing efficient algorithms that learn the conditional probability tables of dynamic Bayesian networks (DBNs) given their structure to the case in which DBN structure is not known in advance. Our method learns th...


An MCMC Approach to Solving Hybrid Factored MDPs

Hybrid approximate linear programming (HALP) has recently emerged as a promising framework for solving large factored Markov decision processes (MDPs) with discrete and continuous state and action variables. Our work addresses its major computational bottleneck – constraint satisfaction in large structured domains of discrete and continuous variables. We analyze this problem and propose a novel...


Efficient Reinforcement Learning in Factored MDPs

We present a provably efficient and near-optimal algorithm for reinforcement learning in Markov decision processes (MDPs) whose transition model can be factored as a dynamic Bayesian network (DBN). Our algorithm generalizes the recent E3 algorithm of Kearns and Singh, and assumes that we are given both an algorithm for approximate planning, and the graphical structure (but not the parameters) o...


Structure Learning in Ergodic Factored MDPs without Knowledge of the Transition Function's In-Degree

This paper introduces Learn Structure and Exploit RMax (LSE-RMax), a novel model based structure learning algorithm for ergodic factored-state MDPs. Given a planning horizon that satisfies a condition, LSE-RMax provably guarantees a return very close to the optimal return, with a high certainty, without requiring any prior knowledge of the in-degree of the transition function as input. LSE-RMax...



Journal:
  • CoRR

Volume: abs/1206.6842

Publication date: 2006